Individual Assignment | CSC 3303 | Big Data Analytics

Introduction

COVID-19 has had an immense global effect, with millions of people infected in more than 218 countries. This assignment analyzes COVID-19 cases in Malaysia. The dataset contains case data for the different states, including the death rate and the growth of active cases in each state. It was collected to support better visualizations of COVID-19 cases in Malaysia by state.

Data Science Question

Necessary Libraries and Packages

Data Directory Setup.

Reading Datasets and Exploring

Check whether the data was read successfully.

Data Cleaning & Pre-processing

As we can see, three columns have null values. I converted the null values in two of those columns to zero and dropped the unwanted column.
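The cleaning step described above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the notebook's original code: the column names (`cum_recovered`, `cum_deaths`, `notes`) and the sample rows are hypothetical stand-ins for the real dataset.

```python
# Hypothetical records; the column names and values are assumptions,
# not the real schema or data of the Malaysia COVID-19 dataset.
rows = [
    {"state": "Selangor", "cum_cases": 100, "cum_recovered": None,
     "cum_deaths": None, "notes": None},
    {"state": "Perlis", "cum_cases": 5, "cum_recovered": 3,
     "cum_deaths": 0, "notes": "n/a"},
]


def count_nulls(records):
    """Return {column: number of None values} across all records."""
    counts = {}
    for rec in records:
        for col, val in rec.items():
            counts[col] = counts.get(col, 0) + (1 if val is None else 0)
    return counts


def clean(records, fill_zero, drop):
    """Replace None with 0 in the `fill_zero` columns and remove the `drop` columns."""
    cleaned = []
    for rec in records:
        new = {k: v for k, v in rec.items() if k not in drop}
        for col in fill_zero:
            if new.get(col) is None:
                new[col] = 0
        cleaned.append(new)
    return cleaned


null_counts = count_nulls(rows)
cleaned_rows = clean(rows, fill_zero=("cum_recovered", "cum_deaths"), drop=("notes",))
print(null_counts)
print(cleaned_rows[0])
```

Filling with zero treats a missing count as "no recoveries or deaths recorded", which is reasonable early in an outbreak but would understate totals if values are missing for other reasons.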

Here is a short summary of the data. I also derived the "Active Cases" column using a simple formula: Active Cases = Cumulative Cases - (Cumulative Recovered + Cumulative Deaths)
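As a small worked example of the formula above, the sketch below computes active cases for one record. The function name and the input numbers are illustrative only and are not taken from the dataset.

```python
def active_cases(cum_cases, cum_recovered, cum_deaths):
    """Active Cases = Cumulative Cases - (Cumulative Recovered + Cumulative Deaths)."""
    return cum_cases - (cum_recovered + cum_deaths)


# Illustrative numbers only: 1000 total cases, 700 recovered, 10 deaths.
example_active = active_cases(1000, 700, 10)
print(example_active)  # 290
```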

Visualization

Necessary Libraries for Visualization

From the graphs above, we can see that the highest number of new cases was recorded on 30 January 2021, at exactly 5,728. The highest number of recoveries was recorded on 17 February 2021, at around 5,700. The highest number of deaths was recorded on 18 February, at 25.
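Peak days like the ones reported above can also be read directly from the data rather than from the plot. The sketch below shows one way to do this; the daily series is a toy example with made-up values, not the real case counts.

```python
from datetime import date

# Toy daily series of (date, new_cases); the values are made up for illustration.
daily_new_cases = [
    (date(2021, 1, 29), 120),
    (date(2021, 1, 30), 150),
    (date(2021, 1, 31), 90),
]

# max() with a key function returns the record with the largest daily count.
peak_day, peak_value = max(daily_new_cases, key=lambda record: record[1])
print(peak_day.isoformat(), peak_value)
```

The same pattern applies to daily recoveries and daily deaths by swapping in the corresponding series.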

These graphs show the overall growth of COVID-19, plotting cumulative cases, cumulative recoveries, and new deaths per day. Total cases and recoveries increased gradually and roughly in parallel, whereas the death curve stays close to the baseline.

Figure 5 shows the monthly cumulative active cases, deaths, and recoveries from March 2020 to April 2021. Transmission increased continuously after September 2020, and most people were affected in January and February 2021. Notably, the recovery rate was higher than the death rate.
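A monthly view of a cumulative series can be built by keeping the last reading of each month. The sketch below shows that idea under the assumption that the data is a list of (date, cumulative value) pairs. The helper name and the sample values are hypothetical, and the notebook may have aggregated differently.

```python
from datetime import date


def month_end_values(series):
    """Return {(year, month): cumulative value on the latest date seen in that month}."""
    latest = {}
    for day, value in series:
        key = (day.year, day.month)
        # Keep only the most recent reading for each month.
        if key not in latest or day > latest[key][0]:
            latest[key] = (day, value)
    return {key: value for key, (_, value) in sorted(latest.items())}


# Made-up cumulative counts, deliberately out of order to show that order does not matter.
sample = [
    (date(2020, 3, 31), 50),
    (date(2020, 3, 1), 10),
    (date(2020, 4, 15), 80),
]
monthly = month_end_values(sample)
print(monthly)
```

Taking the last value per month suits cumulative columns. For daily columns such as new cases, a monthly sum would be the more natural aggregate.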

Figure 6 shows the growth of COVID-19 cases in each state. Transmission increased gradually and then decreased after the first month of 2021. In Sarawak, however, transmission remained steady.

Figure 7 shows the total COVID-19 cases as of 15 April 2021. From this graph, it is clear that Selangor is the most affected state, with approximately 121,817 cumulative cases. The lowest transmission was in Perlis.
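The ranking behind a bar chart like this can be sketched as a simple sort of the latest cumulative totals per state. The state labels and totals below are placeholders, not the real figures.

```python
# Placeholder totals per state; labels and numbers are illustrative only.
latest_totals = {"State A": 1200, "State B": 300, "State C": 45}

# Sort states from most to least cumulative cases.
ranked = sorted(latest_totals.items(), key=lambda item: item[1], reverse=True)
most_affected = ranked[0]
least_affected = ranked[-1]
print(most_affected, least_affected)
```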

Figure 9 shows the transmission of COVID-19 in four states. Kuala Lumpur and Selangor show faster transmission than the other two.

Conclusion

To sum up, the purpose of this analysis is to estimate the transmission rate of new COVID-19 cases in the coming days. The statistics show that COVID-19 cases are rising significantly. Using the dataset of Malaysian COVID-19 cases and selecting some states for calculation and analysis, I arrived at some acceptable and interesting results. The analysis is not 100% accurate because of limited resources and data. It is intended to illustrate a concept using some prediction and calculation, which could help people maintain a safe and healthy lifestyle in the future.

References